Abstract

The ILU-based preconditioning methods proposed in previous work each
introduce their own parameters to improve performance. Poorly chosen
parameter values can degrade performance, yet their determination is left
to users. Thus, these previous methods are not reliable in practical CAE use.
This paper proposes a novel ILU-based preconditioner named auto-accelerated
ILU, or A²ILU. In order to improve the convergence, A²ILU introduces
acceleration parameters which modify the ILU-factorized preconditioning
matrix. A²ILU requires no more user operation than the original ILU because
the acceleration parameters are optimized automatically by A²ILU itself.
Numerical tests reveal that the performance of A²ILU is superior to previous
ILU-based methods with manually optimized parameters. The numerical tests
also demonstrate that auto-acceleration can be applied to ILU-based methods
to improve their performance and their robustness against parameter
sensitivity.



1 Introduction 

Preconditioned iterative methods are among the most frequently used
techniques for quickly solving sparse linear systems, which can be the most
time-consuming part of many physical simulations. Among the many existing
preconditioning methods, incomplete LU (ILU) preconditioning is highly
regarded for its generality because it can be applied to matrices with
arbitrary non-zero entry structures. In particular, ILU with no fill-in,
denoted by ILU(0) [15], which is the most basic form of ILU preconditioning,
requires no information other than the equations themselves, i.e., it is a
"parameter-free method," making it extremely practical. Because of these
properties, it has achieved success in a wide variety of physical simulations.

Several ILU-based methods have been proposed to improve the computational
performance of ILU(0) [31 131 [71 [TB]. These are classified into two types:
one changes only the non-zero entry values in the preconditioning matrices
obtained by ILU(0), and the other changes both the non-zero entry structure
and the values. The first type includes shifted ILU [H] and modified ILU.
Shifted ILU performs ILU factorization on a matrix obtained by shifting the
diagonal entries. Modified ILU multiplies the fill-ins to be discarded by a
relaxation parameter and subtracts the products from the diagonal entries to
reduce the approximation error. The second type includes ILU with k extra
diagonals [9], fill-in level ILU [20], and ILUT [17], as well as Crout ILU
[13]. ILU with k extra diagonals chooses whether to allow or discard fill-ins
depending on their position in the discrete space. Fill-in level ILU extends
this method to general sparse matrices. ILUT and Crout ILU allow fill-ins
only within limits set on the number of non-zero entries and on their values.
Methods of the first type can be combined with other ILU-based methods.

Each of these ILU-based methods improves performance by introducing its own
parameters. Performance is, however, degraded by inadequate parameter values.
Since none of these methods sets the parameters automatically, the choice is
left to users. A straightforward way to optimize the parameters is a
brute-force search over a set of candidates. Let N_s denote the number of
candidates. This type of optimization needs to solve the given linear system
N_s times. The approach is therefore worthwhile only when the system with the
same coefficient matrix is solved for many more than N_s different
right-hand-side (RHS) vectors.

At workplaces where CAE (Computer-Aided Engineering) is used, a wide variety
of physical simulations is used to describe various physical phenomena with
different modeling methods. Even for one physical simulation, calculations
are carried out under different conditions and constraints, depending on the
computational size as well as the physical properties and structures of the
materials. Since different conditions and constraints change the coefficient
matrices, one needs to solve a large number of different linear systems.
Hence, to use the ILU-based methods listed above, one has no choice but to
apply the same empirically predetermined parameter value to every linear
system. However, since every linear system has different optimum parameter
values, these methods cannot deliver high performance and may sometimes
diverge. In summary, the previous ILU-based methods have traded away the
"parameter-free" advantage of ILU(0) preconditioning to improve performance,
and as a result they have significant problems in practice.

In recent years, new indices to evaluate the effect of ILU preconditioning
have been proposed [3 13 El [H]. These indices are functions of the
ILU-factorized matrices. However, they cannot be described as explicit
functions of the parameters because the parameters directly change not only
the values of entries but also the sequence of operations in ILU
factorization. Therefore, like the previous ILU-based methods described
above, they can be used only for brute-force optimization.

More sophisticated methods have been proposed to determine the ordering of
the ILU factorization using indices calculated from the matrices being
processed [5]. The indices are called Minimum Update Matrix (MUM) and
Minimum Discarded Fill (MDF). Because these self-ordering processes take a
tremendously long time, the total calculation time increases considerably in
almost all cases. Therefore, these methods are practical only in the special
situation described above, where the same coefficient matrix is reused for
many RHS vectors.

With this background, in this paper we attempt to improve ILU preconditioning
performance without losing practicality for physical simulations [16].
First, we introduce into ILU preconditioning new acceleration parameters that
are easy to optimize automatically. We then describe a mechanism which
automatically optimizes these acceleration parameters and propose the process
as auto-accelerated ILU preconditioning, or A²ILU for short. A²ILU can also
be applied to the previous ILU-based methods, including shifted ILU, modified
ILU, fill-in level ILU, and Crout ILU.

The rest of the paper is organized as follows. Section 2 gives an overview of
ILU preconditioning. Section 3 explains the basics of A²ILU preconditioning;
acceleration parameters for ILU preconditioning are introduced first,
followed by an explanation of the mechanism that automatically optimizes the
acceleration parameters. In Section 4, the performance of A²ILU
preconditioning is evaluated by numerical experiments. In Subsection 4.1, we
evaluate the performance of A²ILU(0) preconditioning with respect to linear
systems obtained by discretization of PDEs on rectangular grids and validate
its generality, practicality, and scalability. We also consider applying the
auto-acceleration to the major ILU-based methods and validate the results. In
Subsection 4.2, we evaluate the performance of shifted A²ILU(0) (shifted
ILU(0) with the auto-acceleration) preconditioning for general sparse
matrices by using more than 200 sample matrices obtained from the University
of Florida Sparse Matrix Collection [4].



2 ILU preconditioning for sparse matrices 

In this section we briefly introduce ILU preconditioning before proposing
our method.

Consider a system of linear equations whose coefficients are given by a sparse 
matrix A, denoted as follows: 

Ax = b. (1) 

When such a system is solved by an iterative method, the convergence strongly
depends on the properties of the coefficient matrix. The number of iterations
can be expected to decrease as the coefficient matrix approaches the identity
matrix. Preconditioning reconstructs the system as equation (2),

$(K_1^{-1} A K_2^{-1})(K_2 x) = K_1^{-1} b,$ (2)

$M = K_1 K_2.$ (3)

Here, the matrix M in equation (3) is called the preconditioning matrix. The
closer M is to A, the closer $K_1^{-1} A K_2^{-1}$ (the coefficient matrix
after the reconstruction) is to the identity matrix, thus improving the
convergence significantly.

ILU preconditioning performs LU factorization on coefficient matrix A in an
incomplete way and uses the result as a preconditioning matrix. LU
factorization completely decomposes coefficient matrix A into a product
written in terms of a strictly lower-triangular matrix L, a diagonal matrix
D, and a strictly upper-triangular matrix U,

$A = (L + D) D^{-1} (D + U).$ (4)

On the other hand, incomplete LU (ILU) factorization keeps a certain degree
of sparsity in these matrices by discarding part of the fill-in in the course
of the factorization process. Some sophisticated ILUs such as Crout ILU
discard not only fill-in but also the updated original entries. In
particular, ILU(0) requires that these matrices inherit the non-zero entry
structure of coefficient matrix A by discarding all fill-ins. Let
$\tilde{L}$ and $\tilde{U}$ denote the strictly lower-triangular matrix and
the strictly upper-triangular matrix thus obtained, respectively. Using these
matrices, the ILU-preconditioning matrix M can be expressed as follows:

$M = (\tilde{L} + D) D^{-1} (D + \tilde{U}).$ (5)

Neither L and $\tilde{L}$ nor U and $\tilde{U}$ generally coincide, so there
is a difference between matrices A and M. The difference between A and M is
called the remainder matrix R,

$R = A - M,$ (6)

which is used later to evaluate the approximation accuracy of a
preconditioning matrix.
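As a concrete illustration of equations (5) and (6), the following minimal
Python sketch performs ILU(0) on a small matrix and forms the remainder
R = A - M. The function names (`laplacian_2d`, `ilu0`, `remainder`), the dense
list-of-lists storage, and the 3 x 3 grid test matrix are assumptions made
here for illustration only; a practical implementation would use sparse
storage.

```python
def laplacian_2d(nx, ny):
    """Dense 5-point Laplacian on an nx-by-ny grid (illustrative test matrix)."""
    n = nx * ny
    a = [[0.0] * n for _ in range(n)]
    for y in range(ny):
        for x in range(nx):
            i = y * nx + x
            a[i][i] = 4.0
            if x > 0:
                a[i][i - 1] = -1.0
            if x < nx - 1:
                a[i][i + 1] = -1.0
            if y > 0:
                a[i][i - nx] = -1.0
            if y < ny - 1:
                a[i][i + nx] = -1.0
    return a


def ilu0(a):
    """ILU(0) of a dense-stored matrix. Returns (l, d, u) such that
    M = (l + d) d^{-1} (d + u); l and u are strictly triangular, d is a list."""
    n = len(a)
    keep = [[a[i][j] != 0.0 for j in range(n)] for i in range(n)]
    w = [row[:] for row in a]
    for i in range(1, n):
        for k in range(i):
            if not keep[i][k]:
                continue
            w[i][k] /= w[k][k]                  # multiplier for row k
            for j in range(k + 1, n):
                if keep[i][j]:                  # fill-in outside the pattern is discarded
                    w[i][j] -= w[i][k] * w[k][j]
    d = [w[i][i] for i in range(n)]
    # Convert unit-lower multipliers into the (L + D) D^{-1} convention.
    l = [[w[i][j] * d[j] if j < i else 0.0 for j in range(n)] for i in range(n)]
    u = [[w[i][j] if j > i else 0.0 for j in range(n)] for i in range(n)]
    return l, d, u


def remainder(a, l, d, u):
    """R = A - M, using the expanded form M = L + D + U + L D^{-1} U."""
    n = len(a)
    r = [row[:] for row in a]
    for i in range(n):
        for j in range(n):
            m_ij = l[i][j] + u[i][j] + (d[i] if i == j else 0.0)
            m_ij += sum(l[i][k] * u[k][j] / d[k] for k in range(min(i, j)))
            r[i][j] -= m_ij
    return r


a = laplacian_2d(3, 3)
l, d, u = ilu0(a)
r = remainder(a, l, d, u)
print(r[3][1])  # entry at a discarded fill-in position
```

In this sketch R vanishes on the non-zero pattern of A and is non-zero only
where fill-in was discarded, which is the error the acceleration parameters
of Section 3 try to reduce.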



3 Auto-accelerated ILU preconditioner 

In this section we propose the auto-acceleration of ILU preconditioning. Our 
new acceleration parameters are used to change only non-zero entry values after 
ILU factorization. These parameters can be optimized automatically because 
the ILU factorized matrix can be described as an explicit function of them. 

3.1 Introduction of acceleration parameters 

In order to improve the computational performance, ILU preconditioning is
required to reduce both the number of iterations and the operation count per
iteration step. The former is obtained by the reconstruction of the
coefficient matrix shown in equation (2). The latter is obtained by keeping
the sparsity of the coefficient matrix. However, the sparsity degrades the
approximation accuracy of the preconditioning matrix, resulting in an
increase in the number of iterations.

To achieve a balance between the two factors, ILU preconditioning performs
incomplete LU factorization on coefficient matrix A under some constraints
related to the structure of the non-zero entries. However, the effects of
the constraints on the accuracy of ILU factorization are quantitatively
unknown. There is no guarantee that the preconditioning matrix obtained by
ILU factorization has an optimum level of approximation for coefficient
matrix A while preserving its non-zero entry structure. We focus on this
point and attempt to improve the approximation accuracy for coefficient
matrix A by modifying only the values (without changing the non-zero entry
structure) of the preconditioning matrix obtained by ILU factorization. If
such an attempt is successful, the number of iterations can be reduced
without increasing the computational time for each iteration, ensuring faster
computation without any trade-off. In our proposed method, we introduce
distinct scalar parameters for the matrices obtained by ILU(0) or the other
ILU-based methods: $\phi$ for the strictly triangular matrices $\tilde{L}$
and $\tilde{U}$, and $\gamma$ for the diagonal matrix D,

$M = (\phi \tilde{L} + \gamma D)(\gamma D)^{-1}(\gamma D + \phi \tilde{U}).$ (7)

In the rest of this paper, we refer to $\phi$ and $\gamma$ as acceleration
parameters.
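To make equation (7) concrete, the sketch below builds M once as the product
of its two triangular factors and once in multiplied-out form,
$\phi \tilde{L} + \gamma D + \phi \tilde{U} + (\phi^2/\gamma) \tilde{L} D^{-1} \tilde{U}$,
and compares the two. The function names and the 3 x 3 factors are
hypothetical and chosen only to exercise the formula; dense storage is used
for brevity.

```python
def accelerated_m(l, d, u, phi, gamma):
    """Equation (7) as the product of its two triangular factors (dense)."""
    n = len(d)
    # left = (phi L + gamma D)(gamma D)^{-1}: unit lower triangular
    left = [[phi * l[i][j] / (gamma * d[j]) for j in range(n)] for i in range(n)]
    # right = gamma D + phi U: upper triangular
    right = [[phi * u[i][j] for j in range(n)] for i in range(n)]
    for i in range(n):
        left[i][i] = 1.0
        right[i][i] = gamma * d[i]
    return [[sum(left[i][k] * right[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def expanded_m(l, d, u, phi, gamma):
    """Same matrix multiplied out:
    phi L + gamma D + phi U + (phi^2 / gamma) L D^{-1} U."""
    n = len(d)
    m = [[phi * (l[i][j] + u[i][j]) for j in range(n)] for i in range(n)]
    for i in range(n):
        m[i][i] += gamma * d[i]
        for j in range(n):
            ldu = sum(l[i][k] * u[k][j] / d[k] for k in range(min(i, j)))
            m[i][j] += phi * phi / gamma * ldu
    return m


# Hypothetical 3x3 factors, used only to check the algebra.
l = [[0.0, 0.0, 0.0], [2.0, 0.0, 0.0], [1.0, 3.0, 0.0]]
d = [4.0, 5.0, 6.0]
u = [[0.0, 1.0, 2.0], [0.0, 0.0, 3.0], [0.0, 0.0, 0.0]]
m1 = accelerated_m(l, d, u, 1.5, 0.9)
m2 = expanded_m(l, d, u, 1.5, 0.9)
max_diff = max(abs(m1[i][j] - m2[i][j]) for i in range(3) for j in range(3))
```

The multiplied-out form shows that M depends on $\phi$ and $\gamma$ only
through the scalar factors $\phi$, $\gamma$, and $\phi^2/\gamma$, while the
non-zero structure of the factors is unchanged.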



3.2 Automatic optimization of the acceleration parameters

We now discuss the mechanism which automatically optimizes the acceleration
parameters. From the discussion in the previous section, minimizing the
approximation error of the preconditioning matrix relative to the coefficient
matrix is considered equivalent to maximizing the improvement in convergence
of the iterative method. Therefore, if we express the approximation error
with some objective function of R, we can optimize the acceleration
parameters automatically with gradient-based methods, because the objective
function of R can be written explicitly as a function of these acceleration
parameters.

We adopt the squared Euclidean norm of Re, where $e = (1, \ldots, 1)^T$, as
the objective function of R,

$f(R) = \| R e \|_2^2 = \sum_{i=0}^{n-1} \Bigl( \sum_{j=0}^{n-1} r_{ij} \Bigr)^2.$ (8)

Here, n is the size of the matrix, i is the index for the rows, and j for the
columns. The objective function is based on the idea of modified ILU, in
which the condition number, and consequently the number of iterations, is
reduced by minimizing (A - M)e = Re when solving an elliptic PDE.

The objective function f(R) is written as a non-linear explicit function of
$\phi$ and $\gamma$,

$f(R) = \sum_{i=0}^{n-1} \Bigl( \sum_{j=0}^{n-1} \Bigl[ a_{ij} - \phi \tilde{l}_{ij} - \gamma d_{ij} - \phi \tilde{u}_{ij} - \phi^2 \gamma^{-1} \sum_{k=0}^{\min[i,j]-1} \tilde{l}_{ik} d_{kk}^{-1} \tilde{u}_{kj} \Bigr] \Bigr)^2.$ (9)

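The optimum values of the acceleration parameters are obtained by minimizing
the objective function f(R). When the Newton-Raphson method is used for the
optimization, the update equation for the parameters is given as

$\begin{bmatrix} \phi^{(t+1)} \\ \gamma^{(t+1)} \end{bmatrix} = \begin{bmatrix} \phi^{(t)} \\ \gamma^{(t)} \end{bmatrix} - H^{-1} g.$ (10)

Here, H and g are, respectively, a Hessian matrix and a gradient vector
defined by

$H = \begin{bmatrix} (\partial^2/\partial\phi^2) f(R) & (\partial^2/\partial\phi\,\partial\gamma) f(R) \\ (\partial^2/\partial\gamma\,\partial\phi) f(R) & (\partial^2/\partial\gamma^2) f(R) \end{bmatrix},$ (11)

$g = \begin{bmatrix} (\partial/\partial\phi) f(R) & (\partial/\partial\gamma) f(R) \end{bmatrix}^T.$ (12)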
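The Newton-Raphson update can be sketched in a few lines once the row sums in
the objective function are precomputed. The shorthand below is an assumption
introduced for this example, not notation from the paper: for each row i,
`a[i]` is the row sum of A, `s[i]` the row sum of the strictly triangular
factors, `d[i]` the diagonal entry, and `c[i]` the row sum of the product of
the lower factor, the inverse diagonal, and the upper factor. The i-th
component of Re is then
`rho_i = a[i] - phi*s[i] - gamma*d[i] - (phi**2/gamma)*c[i]`, and the
derivatives follow by direct differentiation. The starting point and the
absence of a stopping rule are also assumptions.

```python
def objective(a, s, d, c, phi, gamma):
    """f = sum_i rho_i^2 with rho_i = a_i - phi*s_i - gamma*d_i - (phi^2/gamma)*c_i."""
    return sum((ai - phi * si - gamma * di - phi * phi * ci / gamma) ** 2
               for ai, si, di, ci in zip(a, s, d, c))


def newton_step(a, s, d, c, phi, gamma):
    """One Newton-Raphson update (phi, gamma) <- (phi, gamma) - H^{-1} g."""
    g = [0.0, 0.0]
    h = [[0.0, 0.0], [0.0, 0.0]]
    for ai, si, di, ci in zip(a, s, d, c):
        rho = ai - phi * si - gamma * di - phi * phi * ci / gamma
        dp = -si - 2.0 * phi * ci / gamma           # d rho / d phi
        dg = -di + phi * phi * ci / gamma ** 2      # d rho / d gamma
        dpp = -2.0 * ci / gamma                     # second derivatives of rho
        dpg = 2.0 * phi * ci / gamma ** 2
        dgg = -2.0 * phi * phi * ci / gamma ** 3
        g[0] += 2.0 * rho * dp
        g[1] += 2.0 * rho * dg
        h[0][0] += 2.0 * (dp * dp + rho * dpp)
        h[0][1] += 2.0 * (dp * dg + rho * dpg)
        h[1][1] += 2.0 * (dg * dg + rho * dgg)
    h[1][0] = h[0][1]
    det = h[0][0] * h[1][1] - h[0][1] * h[1][0]
    # Solve H * delta = g for the 2x2 system by Cramer's rule.
    d_phi = (h[1][1] * g[0] - h[0][1] * g[1]) / det
    d_gamma = (h[0][0] * g[1] - h[1][0] * g[0]) / det
    return phi - d_phi, gamma - d_gamma


# Toy data with c = 0, so f is quadratic and one step reaches the minimizer.
a, s, d, c = [3.0, 4.0], [1.0, 2.0], [1.0, 1.0], [0.0, 0.0]
phi, gamma = newton_step(a, s, d, c, 1.0, 1.0)   # start from plain ILU (1, 1)
f_min = objective(a, s, d, c, phi, gamma)
```

For real factors `c` is non-zero, f is no longer quadratic, and the step
would be repeated until the parameters stop changing; safeguards such as a
step-length control are outside the scope of this sketch.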



The pair of acceleration parameters $\phi$ and $\gamma$ in equation (7),
together with their optimization by minimizing f(R) in equation (8), is
referred to as auto-acceleration. Since the auto-acceleration is applied to a
matrix of the form described in equation (5), it is expected that it can be
incorporated into any other ILU-based method. Applications of the
auto-acceleration to the major ILU-based methods, as well as to ILU(0), are
validated by numerical experiments. In the rest of this paper, an ILU-based
preconditioning method to which the auto-acceleration is applied is denoted
by replacing ILU with A²ILU. For example, shifted ILU(0) with the
auto-acceleration is referred to as shifted A²ILU(0).



3.3 Computational cost of auto-acceleration 

Here we evaluate the computational cost of the auto-acceleration by comparing
it with modified ILU factorization (MILU for short). It is helpful for the
evaluation to divide the computational cost into three parts, i.e., the cost
of loading from memory, operating on the CPU, and storing into memory.
Hereafter, we refer to these three costs as loading cost, operating cost, and
storing cost, respectively.

In the auto-acceleration, the matrix-matrix product
$\tilde{l}_{ik} d_{kk}^{-1} \tilde{u}_{kj}$ in the objective function f(R)
dominates the computational cost. On the other hand, MILU performs the
following operations,

$\tilde{l}_{ij} + d_{ij} + \tilde{u}_{ij} = a_{ij} - \sum_{k=0}^{\min[i,j]-1} \tilde{l}_{ik} d_{kk}^{-1} \tilde{u}_{kj}, \quad (i,j) \in P,$ (13)

$d_{ii} = d_{ii} - \omega \sum_{k=0}^{\min[i,j]-1} \tilde{l}_{ik} d_{kk}^{-1} \tilde{u}_{kj}, \quad \text{otherwise},$ (14)

where P is the set of indices of the non-zero entries of the resultant
preconditioning matrix. MILU updates $\tilde{L}$, D, and $\tilde{U}$ by using
equation (13) with an additional modification in equation (14). We can see
that both equations are also dominated by the matrix-matrix product. Since
the matrix-matrix product must be computed for all non-zero elements in both
methods, their loading costs are identical, and so are their operating costs.

As for the storing cost, we can see a difference between these methods. The
auto-acceleration stores only the two scalar acceleration parameters, $\phi$
and $\gamma$, and a constant number of additional variables temporarily used
in equations (10) to (12), while MILU stores all non-zero elements of the
three matrices $\tilde{L}$, D, and $\tilde{U}$. Therefore, the storing cost
of MILU is far larger than that of the auto-acceleration for large practical
problems.

The computational cost of a simple ILU is less than that of MILU by the
matrix-matrix product of the additional modification in equation (14). The
comparison between the auto-acceleration and ILU is much more complicated
than the comparison between the auto-acceleration and MILU. The computational
cost of the auto-acceleration in sample problems will be compared with that
of the whole A²ILU preconditioned iterative method in later sections.

3.4 Constraint on the acceleration parameters 

By preliminary numerical experiments, we found that parameters satisfying
$\gamma / \phi > 1.0$ degraded the computational performance in almost all
cases. Although the theoretical aspect of this problem is under
investigation, we employ the condition $\gamma / \phi < 1.0$ as an empirical
constraint on the parameter values. In the numerical experiments described in
the subsequent section, the acceleration parameters of all auto-accelerated
methods are subject to this constraint.

4 Numerical experiments 

The proposed method is validated by numerical experiments. The numerical
experiments described below are classified into two groups according to the
type of coefficient matrix used. The first group uses multi-diagonal sparse
matrices given by discretizing partial differential equations (PDEs for
short) on rectangular grids, while the second uses general sparse matrices
obtained from the University of Florida Sparse Matrix Collection. The
practicality and performance of A²ILU and auto-accelerated ILU-based methods
are evaluated in the first experiment. The second experiment mainly shows the
effect of the auto-acceleration on shifted ILU(0).

4.1 Performance evaluation for linear systems arising from rectangular grids

In this subsection we evaluate the performance of A²ILU preconditioning for
systems of linear equations obtained through discretization of PDEs on
rectangular grids. We also validate its effectiveness from various aspects.
Here, the coefficient matrices of the systems of equations are
multi-diagonal (e.g., tri-diagonal, penta-diagonal) matrices in which
multiple arrays of non-zero entries are lined up on and along the main
diagonal.

4.1.1 Generality for various types of physical simulations 

Here we validate the generality of A²ILU(0) preconditioning using linear
systems derived from five different types of physical simulation. Table 1
shows the details of each physical simulation. "Type of PDE" and "Stationary"
refer to the specifications of the PDEs obtained by modeling the physical
phenomena. These PDEs were discretized on rectangular grids using the method
given in the column "Discretization" to produce linear systems. Table 2 shows
the details of the coefficient matrix of each linear system and the iterative
method used. The size of the coefficient matrix is denoted by 'n,' and
'nnz/row' refers to the number of non-zero entries in each row. The
convergence criterion of the iterative methods is given by
$\|r\|_2 / \|b\|_2 < \varepsilon$, where r is the recursive residual vector
calculated by the recurrence formula $r_{k+1} = r_k - \alpha_k A p_k$ in the
CG method. The threshold value $\varepsilon$ is determined so as to avoid
pseudo-convergence, i.e., so that $\|s\|_2 / \|r\|_2 < 2$ is satisfied until
the iterative method is terminated, where s is the true residual vector
calculated by $s_k = b - A x_k$. In each problem we first carry out diagonal
scaling to normalize the coefficient matrix so that every diagonal entry is
1. Table 3 summarizes the details of the computation environment used.
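The difference between the recursive residual r and the true residual s can
be seen in a plain CG loop. The sketch below is unpreconditioned and uses
dense storage; the function names and the small tridiagonal test system are
assumptions made for illustration and are not the solver used in the
experiments.

```python
import math


def matvec(a, x):
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))


def cg(a, b, eps, max_iter=1000):
    """Plain CG with x_0 = 0. Stops when ||r||_2 / ||b||_2 < eps, where r is
    the recursive residual. Returns (x, ||r||_2, ||s||_2) with s = b - A x."""
    x = [0.0] * len(b)
    r = b[:]                       # r_0 = b - A x_0
    p = r[:]
    rr = dot(r, r)
    b_norm = math.sqrt(dot(b, b))
    for _ in range(max_iter):
        if math.sqrt(rr) / b_norm < eps:
            break
        ap = matvec(a, p)
        alpha = rr / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]   # r_{k+1} = r_k - alpha_k A p_k
        rr_new = dot(r, r)
        p = [ri + (rr_new / rr) * pi for ri, pi in zip(r, p)]
        rr = rr_new
    s = [bi - axi for bi, axi in zip(b, matvec(a, x))]     # true residual s_k = b - A x_k
    return x, math.sqrt(rr), math.sqrt(dot(s, s))


# Small SPD test system: tridiag(-1, 2, -1) with a right-hand side of ones.
n = 5
a = [[2.0 if i == j else (-1.0 if abs(i - j) == 1 else 0.0) for j in range(n)]
     for i in range(n)]
x, r_norm, s_norm = cg(a, [1.0] * n, 1e-10)
```

Because r is updated by recurrence, rounding errors can let it keep shrinking
after s has stagnated; comparing `r_norm` with `s_norm` is a simple way to
detect that a chosen threshold is too strict.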



Table 1: Physical simulations examined.

No.  Physics               Type of PDE          Stationary  Discretization
1    Incompressible fluid  Poisson              Yes         FVM
2    Heat conduction       Poisson              No          FVM
3    Light diffusion       Helmholtz            Yes         FEM
4    Heat radiation        Helmholtz            No          FVM
5    Charge transfer       Advection-diffusion  No          FVM

Table 2: Properties of coefficient matrices and iterative method used.

No.  n       nnz/row  Symmetry  Solver         $\varepsilon$
1    537600  7        Yes       CG             1.0e-11
2    720000  7        Yes       CG             1.0e-12
3    410913  27       Yes       CG             1.0e-12
4    647168  7        Yes       CG             1.0e-11
5    884736  7        No        BiCGSTAB [19]  1.0e-9



Table 3: Calculation environment.

System    NEC LX118TC-4G
CPU       Intel Xeon X5670 2.93GHz / L3-Cache 12MB
Memory    DDR3-1333 48GB
Compiler  Intel C++ Ver. 11.1.075



Each problem was solved using both ILU(0) preconditioning and A²ILU(0)
preconditioning. Table 4 shows the results of solving the linear system
involved in each physical simulation. "Itr." denotes the number of iterations
required for convergence. "Time" denotes the number of seconds required for
convergence; for A²ILU(0), the time for the auto-acceleration is shown in
brackets. $\phi$ and $\gamma$ for A²ILU(0) are the values optimized by the
auto-acceleration process. "Speed-up ratio" is the quotient obtained when the
number of iterations or the computational time for ILU(0) is divided by the
corresponding value for A²ILU(0); it shows how much the performance is
improved by the auto-acceleration.



Table 4: Performance of ILU(0) and A²ILU(0) for sample problems.

     ILU(0)        A²ILU(0)                                  Speed-up ratio
No.  Itr.  Time    Itr.  Time            $\phi$  $\gamma$   Itr.  Time
1    168   4.34    96    2.59 (2.53e-2)  1.02    0.66       1.75  1.68
2    125   6.50    72    3.95 (5.10e-2)  1.52    0.98       1.74  1.65
3    81    6.94    46    4.76 (8.88e-2)  1.13    0.80       1.76  1.46
4    191   8.41    97    4.49 (4.46e-2)  1.53    0.98       1.97  1.87
5    93    10.51   55    6.50 (7.18e-2)  0.80    0.51       1.69  1.62



According to Table 4, performance was improved for every problem when the
auto-acceleration was used. The average speed-up ratio for iterations was
1.78, with a range of [1.69, 1.97]. The average speed-up ratio for time was
1.65, with a range of [1.46, 1.87]. Meanwhile, the acceleration parameters
$\phi$ and $\gamma$ were optimized to different values for each problem. The
optimum values of these acceleration parameters differ depending on the
physical phenomena involved and the type of PDEs used. The time for the
auto-acceleration is only 1-2% of the total computing time of A²ILU(0).
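The summary figures quoted above can be recomputed directly from Table 4. The
short script below does so; the tuple layout and variable names are choices
made for this example, and the numbers are copied from the table.

```python
# (problem no., ILU(0) itr., ILU(0) time, A2ILU(0) itr., A2ILU(0) time,
#  auto-acceleration time), all times in seconds, taken from Table 4.
rows = [
    (1, 168, 4.34, 96, 2.59, 2.53e-2),
    (2, 125, 6.50, 72, 3.95, 5.10e-2),
    (3, 81, 6.94, 46, 4.76, 8.88e-2),
    (4, 191, 8.41, 97, 4.49, 4.46e-2),
    (5, 93, 10.51, 55, 6.50, 7.18e-2),
]

# Speed-up ratio = ILU(0) value divided by the corresponding A2ILU(0) value.
itr_ratios = [ilu_itr / a2_itr for _, ilu_itr, _, a2_itr, _, _ in rows]
time_ratios = [ilu_t / a2_t for _, _, ilu_t, _, a2_t, _ in rows]

avg_itr = sum(itr_ratios) / len(rows)
avg_time = sum(time_ratios) / len(rows)

# Share of the A2ILU(0) solve time spent in the auto-acceleration itself.
overhead = [accel_t / a2_t for _, _, _, _, a2_t, accel_t in rows]

print(f"iterations: avg {avg_itr:.2f}, range [{min(itr_ratios):.2f}, {max(itr_ratios):.2f}]")
print(f"time:       avg {avg_time:.2f}, range [{min(time_ratios):.2f}, {max(time_ratios):.2f}]")
print("overhead:  ", ", ".join(f"{100 * o:.1f}%" for o in overhead))
```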

4.1.2 Comparison between A²ILU(0) and ILU-based methods

In this subsection we compare A²ILU(0) preconditioning with the various
ILU-based methods that have been previously proposed and validate the
superiority of A²ILU(0) preconditioning. We chose the following four major
methods for comparison.

1. Shifted ILU

ILU factorization in this method is performed on the matrix $\hat{A}$
obtained by shifting the diagonal entries of coefficient matrix A. Plain ILU
corresponds to the shift parameter $\alpha = 0$. We use shifted ILU(0).

$\hat{A} = A + \alpha \, \mathrm{diag}(A).$ (15)

2. Modified ILU

In this ILU factorization, the fill-in being discarded is multiplied by a
relaxing parameter $\omega$, and the product is subtracted from the diagonal
entry of that row. Plain ILU corresponds to the case where $\omega = 0$. We
use modified ILU(0).

$d_{ii} = d_{ii} - \omega \sum_{k=0}^{\min[i,j]-1} \tilde{l}_{ik} d_{kk}^{-1} \tilde{u}_{kj} \quad \text{if } i \neq j \text{ and } a_{ij} = 0.$ (16)

3. Fill-in level ILU

In this ILU factorization, fill-ins are allowed as long as the fill-in level
is below or equal to p. The fill-in level $lev_{ij}$ at entry (i, j) is
updated during ILU factorization as the non-zero entries are updated. The
initial value of the fill-in level is given by equation (17) and is updated
by equation (18). When p = 0, it amounts to ILU(0).

$lev_{ij} = 0$ if $a_{ij} \neq 0$ or $i = j$; $lev_{ij} = \infty$ otherwise. (17)

$lev_{ij} = \min\{ lev_{ij}, \; lev_{ik} + lev_{kj} + 1 \}.$ (18)

4. Crout ILU

In this ILU factorization, L is accessed by columns and U by rows in the same
way as in Crout LU factorization. Any fill-in whose impact on $L^{-1}$ or
$U^{-1}$ is less than a tolerance is discarded, and the number of fill-ins in
each column of L or each row of U is limited. The limit is given by
$(nnz / 2n) \cdot m$, where m is the ratio of maximum fill-in. When the drop
tolerance tol = 0 and $m = \infty$, it amounts to complete LU factorization.
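The fill-in level rule of equations (17) and (18) can be illustrated with a
purely symbolic pass over the non-zero pattern. The sketch below assumes a
row-by-row elimination order and a dense boolean pattern; the function name
and the small arrowhead test pattern are hypothetical.

```python
INF = float("inf")


def fill_levels(pattern, p):
    """Symbolic fill-in level ILU: pattern[i][j] is True where a_ij != 0.
    Returns lev with lev[i][j] <= p for kept entries and INF for discarded ones."""
    n = len(pattern)
    # Equation (17): level 0 on the original pattern and the diagonal.
    lev = [[0 if (pattern[i][j] or i == j) else INF for j in range(n)]
           for i in range(n)]
    for i in range(1, n):
        for k in range(i):
            if lev[i][k] > p:          # entry (i, k) was discarded, nothing to eliminate
                continue
            for j in range(k + 1, n):
                # Equation (18): candidate level of the fill-in created at (i, j).
                cand = lev[i][k] + lev[k][j] + 1
                if cand <= p:
                    lev[i][j] = min(lev[i][j], cand)
    return lev


# Arrowhead pattern: full first row and column plus the diagonal.
n = 4
pattern = [[(i == 0 or j == 0 or i == j) for j in range(n)] for i in range(n)]
kept_p0 = sum(v <= 0 for row in fill_levels(pattern, 0) for v in row)
kept_p1 = sum(v <= 1 for row in fill_levels(pattern, 1) for v in row)
```

With p = 0 the pattern of A is reproduced unchanged, which corresponds to
ILU(0); with p = 1 every level-1 fill-in created by eliminating the first row
is kept, so the arrowhead pattern fills in completely.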

We have five parameters to be specified before running the methods: shift
parameter $\alpha$, relaxing parameter $\omega$, fill-in level p, drop
tolerance tol, and ratio of maximum fill-in m. Since their optimum values are
unknown, each method is run for a set of candidate parameter values as
follows,

$\alpha = -0.4 + 0.1 j, \quad j \in \{0, 1, \ldots, 10\}$

$\omega = -0.5 + 0.1 j, \quad j \in \{0, 1, \ldots, 16\}$

$p \in \{1, 2, 3\}$

$m \in \{1, 2, 5, 10\}$

$tol \in \{0.001, 0.002, 0.004, 0.01, 0.02, 0.04, 0.1, 0.2\}$
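To give a sense of the cost of such a brute-force search, the snippet below
enumerates the candidate sets listed above and counts how many linear solves
one problem requires. It assumes that Crout ILU is run for every combination
of m and tol; the variable names are chosen for this example.

```python
# Candidate parameter values for each previous method, as listed in the text.
alphas = [round(-0.4 + 0.1 * j, 1) for j in range(11)]   # shifted ILU(0)
omegas = [round(-0.5 + 0.1 * j, 1) for j in range(17)]   # modified ILU(0)
levels = [1, 2, 3]                                        # fill-in level ILU
ms = [1, 2, 5, 10]                                        # Crout ILU, max fill-in ratio
tols = [0.001, 0.002, 0.004, 0.01, 0.02, 0.04, 0.1, 0.2]  # Crout ILU, drop tolerance

# Assumption: Crout ILU is tried on the full (m, tol) grid.
crout_grid = [(m, tol) for m in ms for tol in tols]

solves_per_problem = len(alphas) + len(omegas) + len(levels) + len(crout_grid)
print(alphas[0], alphas[-1], omegas[0], omegas[-1], solves_per_problem)
```

Each of these solves is a full preconditioned iteration on the same linear
system, which is why the search only pays off when the coefficient matrix is
reused many times.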

We used the same sample problems and computation environment as in
Subsection 4.1.1. Tables 1, 2, and 3 show the details. Figures 1 through 5
show the results for Sample Problems 1 through 5, respectively, for all
methods except Crout ILU. From left to right along the horizontal axis, each
graph shows ILU(0), shifted ILU(0), modified ILU(0), and fill-in level ILU,
in that order. The values on the horizontal axis indicate the parameters
$\alpha$, $\omega$, and p for the respective methods. The vertical axis
represents the computational time required for convergence (in seconds). The
light and dark gray bars show the results of the previous methods and the
auto-accelerated previous methods, respectively. The leftmost light and dark
gray bars are, respectively, ILU(0) and A²ILU(0).

Figures 6 through 10 show the results for Sample Problems 1 through 5,
respectively, for Crout ILU. The horizontal axes represent the drop tolerance
tol and the ratio of maximum fill-in m. The vertical axis represents the
computational time required for convergence in seconds. The results for Crout
ILU and for Crout A²ILU are shown in the left and right graphs, respectively.
The dark gray bars in the results of Crout A²ILU denote the cases in which
the auto-acceleration reduces the computing time of Crout ILU.

Figure 1: Result of sample problem No. 1 by all methods except Crout ILU.

Figure 2: Result of sample problem No. 2 by all methods except Crout ILU.

Figure 3: Result of sample problem No. 3 by all methods except Crout ILU.

Figure 4: Result of sample problem No. 4 by all methods except Crout ILU.

Figure 5: Result of sample problem No. 5 by all methods except Crout ILU.

First, we compare the results of A²ILU(0) preconditioning and the previous
ILU-based methods from the standpoint of practicality. In each of the four
ILU-based methods, the value of the parameter drastically influences the
performance. See the light gray bars for shifted ILU(0), modified ILU(0), and
fill-in level ILU in Figures 1 through 5, and the left-hand graphs in Figures
6 through 10 for Crout ILU. If the parameter is optimized, all these
ILU-based methods take less time to converge than ILU(0). However, the
farther the parameter is from its optimum value, the longer the computational
time. In several cases it takes more time to converge than ILU(0), and in
some cases the residual norm diverges, so that no convergent solution is
obtained. To avoid such unacceptable situations, the parameter must be
optimized in advance. However, the optimum value of each parameter differs
from problem to problem. For instance, the optimum value of the relaxing
parameter $\omega$ in modified ILU varies from 0.3 to 1.0 depending on the
problem. The parameters must therefore be optimized individually for each
problem. However, it is essentially impossible to optimize the parameters so
efficiently that the optimization does not cancel the performance improvement
gained over ILU(0). For instance, in Figure 1, obtaining the optimum value of
$\alpha$ for shifted ILU(0) by brute-force search takes the sum of the
computational times over the set of candidate parameter values, which is more
than 10 times that of ILU(0) preconditioning. A²ILU(0) preconditioning has
none of these problems since the acceleration parameters are automatically
optimized in a short time.

Next, we compare A²ILU(0) preconditioning and the previous methods from the
standpoint of performance. According to Figures 1 to 10, A²ILU(0)
preconditioning takes less time to converge than any of the previous methods
in every trial (Figures 1 to 5, all light gray bars; Figures 6 to 10, all
left-hand graphs) except for a few cases (modified ILU(0) with
$\omega = 1.0$ in Figure H shifted ILU(0) with $\alpha = -0.3$ in Figure S]).
This means that A²ILU(0) preconditioning is better in almost all cases even
if the optimum value of the parameter for each previous method could be
predetermined by some means.

The above results show that A²ILU(0) preconditioning is superior to the
previous ILU-based methods, from the standpoints of both performance and
practicality, for multi-diagonal matrices obtained from PDEs.

Figure 6: Result of sample problem No. 1 by Crout ILU. The dark gray bars in
the results of Crout A²ILU denote the cases when the auto-acceleration
reduces the computing time of Crout ILU.

Figure 7: Result of sample problem No. 2 by Crout ILU. The dark gray bars in
the results of Crout A²ILU denote the cases when the auto-acceleration
reduces the computing time of Crout ILU.

Figure 8: Result of sample problem No. 3 by Crout ILU. The dark gray bars in
the results of Crout A²ILU denote the cases when the auto-acceleration
reduces the computing time of Crout ILU.

Figure 9: Result of sample problem No. 4 by Crout ILU. The dark gray bars in
the results of Crout A²ILU denote the cases when the auto-acceleration
reduces the computing time of Crout ILU.

Figure 10: Result of sample problem No. 5 by Crout ILU. The dark gray bars in
the results of Crout A²ILU denote the cases when the auto-acceleration
reduces the computing time of Crout ILU.



4.1.3 Application of auto-acceleration to previous ILU-based methods

In this subsection we validate the effects of incorporating the
auto-acceleration into the previous ILU-based methods. In Figures 1 through
5, the dark gray bars show the results when the auto-acceleration was applied
to shifted ILU(0), modified ILU(0), and fill-in level ILU. The
auto-acceleration for these methods decreases computational time in every
problem regardless of the values of their original parameters.

When the auto-acceleration was applied to shifted ILU(0) and modified ILU(0),
the response to the original parameter became fairly uniform, except for
cases with extremely long computational times, in which some diagonal
elements are too close to zero for the auto-acceleration to help. Note that
modified ILU(0) with $\omega = 1$ already minimizes the objective function,
f(R) = 0, so that the auto-acceleration is fruitless. For most parameter
values, convergence took less time than with ILU(0). Hence, even without
optimizing their original parameters for each problem, the performance was
improved over ILU(0). This suggests that the auto-acceleration improves the
practicality of shifted ILU(0) and modified ILU(0) as well.
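The remark that modified ILU(0) with $\omega = 1$ gives f(R) = 0 can be
checked numerically. The sketch below is one common row-wise formulation of
modified ILU(0), in which every discarded fill-in is subtracted from the
diagonal of its row; the function names, the dense storage, and the 3 x 3
grid test matrix are assumptions made for illustration.

```python
def laplacian_2d(nx, ny):
    """Dense 5-point Laplacian on an nx-by-ny grid (illustrative test matrix)."""
    n = nx * ny
    a = [[0.0] * n for _ in range(n)]
    for y in range(ny):
        for x in range(nx):
            i = y * nx + x
            a[i][i] = 4.0
            if x > 0:
                a[i][i - 1] = -1.0
            if x < nx - 1:
                a[i][i + 1] = -1.0
            if y > 0:
                a[i][i - nx] = -1.0
            if y < ny - 1:
                a[i][i + nx] = -1.0
    return a


def milu0(a, omega):
    """Modified ILU(0). Returns w holding the unit-lower multipliers below the
    diagonal and the upper factor on and above it; omega = 0 is plain ILU(0)."""
    n = len(a)
    keep = [[a[i][j] != 0.0 for j in range(n)] for i in range(n)]
    w = [row[:] for row in a]
    for i in range(1, n):
        for k in range(i):
            if not keep[i][k]:
                continue
            w[i][k] /= w[k][k]
            for j in range(k + 1, n):
                update = w[i][k] * w[k][j]
                if keep[i][j]:
                    w[i][j] -= update
                else:
                    w[i][i] -= omega * update   # discarded fill-in moved to the diagonal
    return w


def objective(a, w):
    """f(R) = ||(A - M) e||_2^2 with M = (unit lower) * (upper) taken from w."""
    n = len(a)
    f = 0.0
    for i in range(n):
        row_sum = 0.0
        for j in range(n):
            m_ij = sum(w[i][k] * w[k][j] for k in range(min(i, j + 1)))
            if i <= j:
                m_ij += w[i][j]             # contribution of the unit diagonal of L
            row_sum += a[i][j] - m_ij
        f += row_sum * row_sum
    return f


a = laplacian_2d(3, 3)
f_ilu = objective(a, milu0(a, 0.0))    # plain ILU(0)
f_milu = objective(a, milu0(a, 1.0))   # modified ILU(0) with omega = 1
```

With $\omega = 1$ the row sums of M equal those of A up to rounding, so
`f_milu` is zero to machine precision while `f_ilu` is not; there is then
nothing left for the two acceleration parameters to reduce.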

Figures 6 through 10 show the effect of the auto-acceleration on Crout ILU.
The left and right graphs depict the results before and after application of
the auto-acceleration, respectively. For all sample problems shown in Figures
6 through 10, the auto-acceleration decreases the computing time for larger m
and smaller tol, while the smallest m or larger tol gives worse results.
Considering the roles of m and tol, it can be expected that a parameter set
with larger m and smaller tol makes the preconditioning matrix denser.

4.1.4 Scalability for the coefficient matrix size 

In this subsection we examine the scalability of A^2ILU(0) preconditioning with respect to the number of unknowns in the system of linear equations, i.e., the size of the coefficient matrix. As a sample problem we used a Dirichlet boundary-value problem for the three-dimensional Poisson equation,

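$$
\begin{aligned}
-\nabla \cdot (\kappa \nabla u) &= f \quad \text{in } \Omega = (0,1)^3, \qquad (19)\\
u(x, y, z) &= [\ldots] \quad \text{on } \partial\Omega,\\
\kappa(x, y, z) &= \begin{cases} 10^{[\ldots]} & \text{if } [\ldots], \\ [\ldots] & \text{otherwise,} \end{cases}\\
f(x, y, z) &= x + y + z.
\end{aligned}
$$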

To discretize the equations, we used a rectangular grid as described before. The solution vector u was initialized to 0. The convergence criterion of the iterative method is $\|r\|^2 / \|b\|^2 < \varepsilon$. Diagonal scaling was carried out in advance. The number of grid points on each axis was first set to 10 and then doubled step by step up to 640. The specifications of the coefficient matrix and the iterative method are shown in Table 5. Details of the computation environment are shown in Table 3. The results are shown in Tables 6 and 7.



Table 5: Properties of the matrices and the method used.

| n | nnz / row | Symmetry | Solver | ε |
|---|---|---|---|---|
| from 10^3 to 640^3 | 7 | Yes | CG | 1.0e-9 |



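Table 6: Scalability of the speed-up ratio by A^2ILU(0).

| n | ILU(0) Itr. | ILU(0) Time | A^2ILU(0) Itr. | A^2ILU(0) Time | A^2ILU(0) φ | A^2ILU(0) γ | Speed-up ratio (Itr.) | Speed-up ratio (Time) |
|---|---|---|---|---|---|---|---|---|
| 10^3 | 21 | 1.87e-3 | 19 | 1.99e-3 | 1.38 | 1.03 | 0.95 | 0.94 |
| 20^3 | 33 | 1.95e-2 | 27 | 1.70e-2 | 1.86 | 1.24 | 1.22 | 1.15 |
| 40^3 | 65 | 2.57e-1 | 39 | 1.66e-1 | 2.19 | 1.38 | 1.67 | 1.55 |
| 80^3 | 127 | 3.85e+0 | 60 | 1.93e+0 | 2.42 | 1.48 | 2.12 | 2.00 |
| 160^3 | 254 | 5.94e+1 | 98 | 2.39e+1 | 2.59 | 1.55 | 2.59 | 2.49 |
| 320^3 | 503 | 9.21e+2 | 166 | 3.12e+2 | 2.72 | 1.59 | 3.03 | 2.95 |
| 640^3 | 1015 | 1.59e+4 | 287 | 4.52e+3 | 2.80 | 1.62 | 3.54 | 3.52 |
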
Table 7: Reduction of the objective function by A^2ILU(0) and its scalability.

| n | ILU(0) | A^2ILU(0) | Reduction ratio |
|---|---|---|---|
| 10^3 | 4.16e+0 | 1.56e+0 | 0.38 |
| 20^3 | 1.44e+1 | 3.77e+0 | 0.26 |
| 40^3 | 4.36e+1 | 8.55e+0 | 0.20 |
| 80^3 | 1.27e+2 | 1.86e+1 | 0.15 |
| 160^3 | 3.66e+2 | 3.98e+1 | 0.11 |
| 320^3 | 1.04e+3 | 8.38e+1 | 0.08 |
| 640^3 | 2.96e+3 | 1.75e+2 | 0.06 |



Table 6 shows that the speed-up ratio achieved by the auto-acceleration increases with the matrix size, eventually exceeding 3.5 times the original performance. The condition number of a linear system created by discretizing a second-order elliptic PDE, as considered in this article, is $O(h^{-2})$ regardless of the order of differentiation and the dimension of the space of interest. Because the number of CG iterations grows like the square root of the condition number, the number of iterations until convergence for CG with no preconditioning is $O(h^{-1})$. This estimate remains valid even if a simple preconditioner, including ILU(0), is applied. With MILU(0) preconditioning, the condition number is reduced to $O(h^{-1})$ and the number of iterations to $O(h^{-0.5})$. A^2ILU(0), in turn, improves the number of iterations to $O(h^{-0.65})$. Since our method is based on the MILU methodology in constructing the objective function, it inherits the merits of MILU in scalability with respect to the problem size.
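
As a rough check of these orders, the iteration counts reported in Table 6 can be fitted on a log-log scale. The following Python sketch assumes that the mesh size h is proportional to 1/N, where N is the number of grid points per axis, and uses an ordinary least-squares fit; the function name `fit_exponent` is a hypothetical helper and not part of the proposed method.

```python
import math

# Grid points per axis and iteration counts, copied from Table 6.
N = [10, 20, 40, 80, 160, 320, 640]
ITR_ILU0 = [21, 33, 65, 127, 254, 503, 1015]
ITR_A2ILU0 = [19, 27, 39, 60, 98, 166, 287]


def fit_exponent(sizes, iterations):
    """Least-squares slope p of log(iterations) versus log(sizes).

    Assuming h ~ 1/N, a slope p corresponds to iterations = O(h^-p).
    """
    xs = [math.log(s) for s in sizes]
    ys = [math.log(i) for i in iterations]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    num = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    den = sum((x - x_mean) ** 2 for x in xs)
    return num / den


if __name__ == "__main__":
    print("ILU(0)    exponent: %.2f" % fit_exponent(N, ITR_ILU0))
    print("A^2ILU(0) exponent: %.2f" % fit_exponent(N, ITR_A2ILU0))
```

On these seven data points the fitted exponent comes out at roughly 0.95 for ILU(0) and roughly 0.65 for A^2ILU(0), which is in line with the orders quoted above. A fit over so few points is only indicative.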

Table 7 shows the objective function for ILU(0) and A^2ILU(0), together with the reduction ratio, defined as f(R) of A^2ILU(0) divided by that of ILU(0). The table shows that the larger the matrix, the more strongly the auto-acceleration reduces the objective function. The decrease in the objective function indicates that the approximation accuracy of the preconditioning matrix is improved, as explained in Section 3.
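
The reduction ratio can be recomputed directly from the two value columns of Table 7. The short Python sketch below does this; the dictionary and function names are hypothetical, and the values are simply copied from the table.

```python
# Objective-function values f(R) from Table 7, keyed by grid points per axis.
F_ILU0 = {10: 4.16e0, 20: 1.44e1, 40: 4.36e1, 80: 1.27e2,
          160: 3.66e2, 320: 1.04e3, 640: 2.96e3}
F_A2ILU0 = {10: 1.56e0, 20: 3.77e0, 40: 8.55e0, 80: 1.86e1,
            160: 3.98e1, 320: 8.38e1, 640: 1.75e2}


def reduction_ratio(f_accelerated, f_original):
    """f(R) of A^2ILU(0) divided by f(R) of ILU(0)."""
    return f_accelerated / f_original


# Unrounded ratios, one per problem size.
RATIOS = {n: reduction_ratio(F_A2ILU0[n], F_ILU0[n]) for n in F_ILU0}

if __name__ == "__main__":
    for n in sorted(RATIOS):
        print("n = %d^3: ratio = %.2f" % (n, RATIOS[n]))
```

The recomputed ratios agree with the last column of Table 7 to two decimal places and decrease as the grid is refined.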

Hence, from Tables 6 and 7 we conjecture that in ILU(0) preconditioning, the larger the coefficient matrix, the lower the approximation accuracy of the preconditioning matrix. When the auto-acceleration is applied, however, the acceleration parameters are optimized for each preconditioning matrix regardless of the matrix size, which improves the properties of the preconditioning matrix as much as possible. The effect of the auto-acceleration therefore grows with the matrix size, improving the performance of A^2ILU(0) preconditioning relative to ILU(0) preconditioning. For these reasons, we conclude that A^2ILU(0) preconditioning improves the scalability of ILU(0) preconditioning.

4.2 Performance evaluation for general sparse matrices 

In this subsection we evaluate the performance of A^2ILU preconditioning for general linear systems, without restricting the method used to discretize the PDEs, and validate its effectiveness. In this case, the coefficient matrix of the system is an irregular matrix, i.e., one whose non-zero entries are located irregularly. We obtained our sample coefficient matrices from the University of Florida Sparse Matrix Collection under the conditions shown in Table 8. To ensure the reliability of our evaluation, we carried out the validation with as many sample matrices as possible.



Table 8: Criteria for collecting sample matrices.

| Criterion | Value |
|---|---|
| Number of rows | < 600,000 |
| Number of nonzeros | < 20,000,000 |
| Pattern symmetry | Yes |
| Numerical symmetry | Yes |
| Shape | Square |
| Positive definite? | Either |
| 2D/3D discretization? | Yes |
| Real or complex? | Real |
| Binary | No |



For the right-hand side vector b of each linear system, we assigned values such that every entry of the solution vector x is 1. The initial value of every entry of x was 0. In addition to the requirements shown in Table 8, we removed all matrices that contain zeros on the main diagonal and all matrices whose entries have such small absolute values that the right-hand side vector b becomes the zero vector; a total of 217 matrices were considered. As the iterative method we used the CG method, and as the convergence criterion we used $\|r\|^2 / \|b\|^2 \le 1.0\mathrm{e}{-8}$. The maximum number of iterations was set to the size of the coefficient matrix. For each problem we carried out diagonal scaling in advance to normalize the diagonal entries of the coefficient matrix to 1. Table 9 summarizes the specifications of the computation environment.
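
The test setup described above can be sketched as follows. This is a minimal illustration on a small dense matrix stored as nested lists, not the code used for the experiments. It assumes symmetric diagonal scaling $D^{-1/2} A D^{-1/2}$ with positive diagonal entries, and it assumes that the right-hand side is formed after scaling; all helper names are hypothetical.

```python
import math


def diagonal_scale(a):
    """Symmetric diagonal scaling D^{-1/2} A D^{-1/2} with D = diag(A).

    Assumes every diagonal entry is positive, so the scaled diagonal is 1.
    """
    n = len(a)
    s = [1.0 / math.sqrt(a[i][i]) for i in range(n)]
    return [[s[i] * a[i][j] * s[j] for j in range(n)] for i in range(n)]


def matvec(a, x):
    """Dense matrix-vector product."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


def rhs_for_all_ones_solution(a):
    """b = A * (1, ..., 1)^T, so the exact solution has every entry equal to 1."""
    return matvec(a, [1.0] * len(a))


def has_converged(r, b, eps=1.0e-8):
    """Convergence criterion ||r||^2 / ||b||^2 <= eps."""
    return sum(ri * ri for ri in r) <= eps * sum(bi * bi for bi in b)


# Small symmetric example matrix (illustrative only).
A = [[4.0, 1.0, 0.0],
     [1.0, 9.0, 2.0],
     [0.0, 2.0, 16.0]]
A_SCALED = diagonal_scale(A)
B = rhs_for_all_ones_solution(A_SCALED)
X0 = [0.0] * len(B)                      # initial guess: zero vector
R0 = [bi - axi for bi, axi in zip(B, matvec(A_SCALED, X0))]
MAX_ITER = len(B)                        # iteration cap = matrix size
```

With the zero initial guess the initial residual equals b, so `has_converged(R0, B)` is false and the iteration has to run.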



Table 9: Calculation environment.

| Item | Specification |
|---|---|
| System | Fujitsu PRIMERGY RX200 S3 |
| CPU | Intel Xeon 5160 3.00GHz / L2-Cache 4MB |
| Memory | DDR2-667 8GB |
| Compiler | Intel C++ Ver. 11.1.075 |



When solving systems of linear equations arising from an unstructured grid, ILU factorization may produce a diagonal matrix D with some tiny entries, which significantly degrade convergence [Tl]. Shifted ILU preconditioning is an effective and widely used way to avoid this, so we now validate the effectiveness of the auto-acceleration on shifted ILU preconditioning. To obtain the best performance of the original shifted ILU, we use a set of candidate values for the shift parameter. We applied shifted ILU(0) preconditioning and shifted A^2ILU(0) preconditioning to every combination of sample matrix and candidate parameter value, and analyzed the results from the following aspects.



4.2.1 Robustness of shifted A^2ILU(0)

In this subsection we examine how the auto-acceleration affects convergence. Figure 11 shows the overall convergence results for shifted ILU(0) preconditioning and shifted A^2ILU(0) preconditioning. The left graph shows the result for shifted ILU(0) and the right graph that for shifted A^2ILU(0). The horizontal axis indicates the value of the shift parameter, while the vertical axis indicates the number of matrices tallied. Each run is classified into one of three types: convergent, pseudo-convergent, and not convergent. "Pseudo-convergent" refers to a case where the norm of the true residual vector reaches its lower bound while the norm of the recursive residual vector continues to decrease. In this case, an approximate solution with the desired accuracy cannot be expected even if the iteration proceeds.
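
One possible way to tally the three outcomes is sketched below in Python. The text does not spell out the detection rule, so the rule used here is an assumption: a run counts as convergent when the true residual b - Ax meets the tolerance, as pseudo-convergent when only the recursively updated residual does, and as not convergent otherwise. The function and argument names are hypothetical.

```python
def classify_run(true_rel_res, recursive_rel_res, tol=1.0e-8):
    """Classify a finished solver run (assumed rule, see lead-in).

    true_rel_res:      ||b - A x||^2 / ||b||^2, computed explicitly at the end
    recursive_rel_res: ||r||^2 / ||b||^2, with r updated by the CG recurrence
    """
    if true_rel_res <= tol:
        return "convergent"
    if recursive_rel_res <= tol:
        # The recurrence keeps shrinking r while the true residual stagnates.
        return "pseudo-convergent"
    return "not convergent"


def tally(runs, tol=1.0e-8):
    """Count outcomes over an iterable of (true_rel_res, recursive_rel_res)."""
    counts = {"convergent": 0, "pseudo-convergent": 0, "not convergent": 0}
    for true_res, rec_res in runs:
        counts[classify_run(true_res, rec_res, tol)] += 1
    return counts
```

Calling `tally` once per shift parameter value would give the three stacked counts that a chart like Figure 11 displays for each bar.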



[Figure 11 plot: left panel "Shifted ILU(0)", right panel "Shifted A^2ILU(0)"; vertical axis from 140 to 210; legend: Not convergent, Pseudo-convergent, Convergent.]



Figure 11: Result of convergence determination. 



As shown in the left half of the figure, shifted ILU(0) converged for only 148 matrices at a = 0.0. The number of convergent cases increased to 181 as a increased to 0.3. The results thus clearly show the effect of the shift parameter. However, when a increased beyond 0.3, the trend reversed and the number of convergent cases decreased monotonically. Thus, a = 0.3 is the optimum value for this collection of matrices.

Next, we look at the results for shifted A^2ILU(0). Overall, the number of convergent cases is never lower than for shifted ILU(0), for any value of a. For a = 0.4 and above there is a significant increase, which avoids the drop in the number of convergent cases as a grows. Hence, the auto-acceleration maintains or enhances the robustness of shifted ILU(0) preconditioning across a variety of coefficient matrices.

4.2.2 Improved convergence rate by auto-acceleration 

In this subsection we further analyze the effect of the auto-acceleration, not only on the type of convergence but also on its speed. In the discussion below, we evaluate the computing time by the number of iterations, because we found the cost of the auto-acceleration to be negligible relative to the total cost (no more than 1% for a = 0.2 and 0.5).

We divided the increase ratio of the number of iterations into several classes; Figures 12 and 13 show the number of matrices in each class. Here, the increase ratio is defined as $(N_A - N_I)/N_I$, where $N_A$ and $N_I$ are the numbers of iterations required by shifted A^2ILU(0) and shifted ILU(0), respectively. Figure 12 shows the results for shift parameter a = 0.2, and Figure 13 those for a = 0.5. We assigned "Below -50%" to a problem that converges with shifted A^2ILU(0) but not with shifted ILU(0), and "Above +50%" to the inverse case. We recorded "No change" if both methods converged in the same number of iterations or if neither converged.
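
The classification rule described above can be sketched in Python as follows. The handling of a ratio of exactly -50% or +50% is not specified in the text and is an assumption here, as is the use of `None` to mark a run that did not converge; the function name is hypothetical.

```python
def increase_ratio_class(n_a, n_i):
    """Class of the increase ratio (n_a - n_i) / n_i.

    n_a, n_i: iteration counts for shifted A^2ILU(0) and shifted ILU(0);
              None means the method did not converge.
    """
    if n_a is None and n_i is None:
        return "No change"           # neither method converged
    if n_i is None:
        return "Below -50%"          # only shifted A^2ILU(0) converged
    if n_a is None:
        return "Above +50%"          # only shifted ILU(0) converged
    ratio = (n_a - n_i) / n_i
    if ratio == 0:
        return "No change"           # same number of iterations
    if ratio < -0.5:
        return "Below -50%"
    if ratio < 0:
        return "-50% - 0%"           # assumption: exactly -50% falls here
    if ratio <= 0.5:
        return "0% - +50%"           # assumption: exactly +50% falls here
    return "Above +50%"
```

For example, 40 iterations against 100 gives a ratio of -60% and falls into "Below -50%", while 120 against 100 gives +20% and falls into "0% - +50%".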




Figure 12: Increase ratio of the iterations through the auto-acceleration (shift
parameter a = 0.2).

Figure 12 shows that when the shift parameter a is 0.2, the number of matrices whose convergence rate is improved by the auto-acceleration is 123 ("Below -50%" and "-50% - 0%"), or 56.7% of the total. Meanwhile, the convergence rate remained the same for 69 matrices ("No change"), or 31.8% of the total. The number of matrices whose convergence rate got worse was only 25 ("0% - +50%" and "Above +50%"), or 11.5% of the total. The auto-acceleration thus brought far more benefits than drawbacks in convergence rate as well as in robustness.

[Figure 13 chart labels: Below -50%, 14; -50% - 0%, 139; No change, 49; 0% - +50%; Above +50%, 2.]

Figure 13: Increase ratio of the iterations through the auto-acceleration (shift
parameter a = 0.5).

Further, Figure 13 shows that at shift parameter a = 0.5 the percentage of matrices whose convergence rate is improved by the auto-acceleration rises to 70.5%, while the convergence rate did not change for 22.6% and got worse for 6.9% (both lower than at a = 0.2). Hence, we conclude that the auto-acceleration is more clearly effective at larger values of the shift parameter a.
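
The percentages quoted for a = 0.2 and a = 0.5 can be cross-checked against the total of 217 matrices. In the Python sketch below, the counts for a = 0.2 are those stated in the text and the counts for a = 0.5 are the labels of Figure 13; the count for the "0% - +50%" class at a = 0.5 is inferred as the remainder of the 217 matrices, which is an assumption. All names are hypothetical.

```python
TOTAL = 217  # matrices considered in Section 4.2


def share(count, total=TOTAL):
    """Percentage of the total, rounded to one decimal place."""
    return round(100.0 * count / total, 1)


# a = 0.2: counts quoted in the text.
A02 = {"improved": 123, "no change": 69, "worse": 25}

# a = 0.5: counts taken from the labels of Figure 13.
BELOW_50, MINUS_50_TO_0, NO_CHANGE, ABOVE_50 = 14, 139, 49, 2
# Assumption: the "0% - +50%" class holds the remaining matrices.
ZERO_TO_50 = TOTAL - (BELOW_50 + MINUS_50_TO_0 + NO_CHANGE + ABOVE_50)
A05 = {
    "improved": BELOW_50 + MINUS_50_TO_0,
    "no change": NO_CHANGE,
    "worse": ZERO_TO_50 + ABOVE_50,
}

if __name__ == "__main__":
    for label, counts in (("a = 0.2", A02), ("a = 0.5", A05)):
        print(label, {k: share(v) for k, v in counts.items()})
```

Under this assumption the shares reproduce the figures given in the text: 56.7%, 31.8%, and 11.5% at a = 0.2, and 70.5%, 22.6%, and 6.9% at a = 0.5.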

4.2.3 Effect of shifted A^2ILU(0) for the shift parameter

In Subsections 4.2.1 and 4.2.2, we demonstrated, from the standpoint of performance, that the auto-acceleration is robust across a variety of general sparse matrices. In this subsection, we further show, from the standpoint of applicability, that it is robust with respect to the value of the shift parameter. Specifically, we address the selection of the shift parameter, a major practical challenge in shifted ILU(0), by studying how the auto-acceleration affects parameter selection and by verifying the practicality of shifted A^2ILU(0). To illustrate this, Figure 14 shows the results for the sample matrix "Rothberg/cfd2". The horizontal axis represents the shift parameter and the vertical axis the number of iterations.

There are two main effects of using a shift parameter in shifted ILU preconditioning. One is the positive effect of helping the convergence of solutions that would otherwise not converge. Figure 14 shows this effect at a = 0.1. The other is the negative effect of gradually reducing the convergence rate. This is shown by the increase in the number of iterations when a exceeds 0.1. Because of these two opposing effects, shifted ILU(0) is quite sensitive to the value of the shift parameter. In contrast, with shifted A^2ILU(0) preconditioning, only the latter effect, namely the reduced convergence rate when a exceeds 0.1, is suppressed. The robustness of shifted A^2ILU(0) with respect to the shift parameter value is thereby improved. We examine this point below.

[Figure 14 plot: series "Shifted ILU(0)" and "Shifted A^2ILU(0)"; horizontal axis: shift parameter, 0.0 to 0.5.]

Figure 14: Effects of auto-acceleration in "Rothberg/cfd2".

The positive effect of the shift parameter arises for the following reason. In ILU preconditioning, convergence is significantly degraded if a tiny entry appears in the diagonal matrix D during ILU factorization. Shifted ILU avoids this problem by enlarging the diagonal entries of the matrix being factorized. As a result, linear systems that would otherwise fail to converge can reach convergence. The drastic increase in the number of convergent cases for a = 0.0 to 0.2, seen in Figure 11, is thought to be caused by this effect. The auto-acceleration process, on the other hand, does not change the sign of any entry of the diagonal matrix D (even though it scales the matrix), and so does not interfere with this effect. Therefore, shifted A^2ILU(0) is as effective as shifted ILU(0) in this respect, as Figures 14 and 11 show.

Next, the negative effect of the shift parameter arises for the following reason. In shifted ILU, the larger the parameter, the farther the matrix being factorized departs from the coefficient matrix A. This reduces the accuracy of the preconditioning matrix and thus the convergence rate. The decrease in the number of convergent cases when a is 0.3 or greater, shown in Figure 11, is thought to be caused by this effect. The auto-acceleration process, on the other hand, improves the accuracy of the preconditioning matrix through the acceleration parameters, reducing this negative effect. In particular, because the acceleration parameters are automatically optimized in accordance with the remainder matrix, the improvement becomes more pronounced as the shift parameter increases. This is supported by Figures 12 and 13: the larger the value of a, the higher the percentage of matrices whose convergence rate improved. Consequently, as Figures 13 and 11 show, shifted A^2ILU(0) maintains high performance even when the shift parameter is larger than the optimum value. For these reasons, we conclude that the auto-acceleration cancels only the negative effect of the shift parameter, making the method less sensitive to increases in the shift parameter.

With this in mind, we consider the practicality of shifted ILU(0) and shifted A^2ILU(0). Shifted ILU is often used to avoid the breakdown caused by tiny diagonal elements. As described above, a user needs to find the minimum value of a that avoids the breakdown. To do this, shifted ILU(0) requires the user to perform a line search over the value of a, checking at each point whether breakdown occurs. This brute-force optimization is obviously extremely expensive. Shifted A^2ILU(0) is expected to need no such brute-force optimization of a, because the auto-acceleration suppresses the negative effect. Therefore, shifted A^2ILU(0) greatly improves the practicality of shifted ILU(0).
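
The brute-force line search that shifted ILU(0) asks of the user can be sketched as below. This is an illustration under stated assumptions, not a procedure given in the paper: `breaks_down` is a hypothetical callback that would, in practice, run a shifted ILU(0) factorization and an iterative solve for the given shift, and the candidate list is chosen only for the example.

```python
def smallest_safe_shift(breaks_down, candidates):
    """Return the smallest candidate shift for which no breakdown occurs.

    breaks_down: hypothetical callback, shift -> bool. Each call stands for a
                 full factorization and solve, which makes the search costly.
    candidates:  iterable of shift values to try.
    Returns None if every candidate breaks down.
    """
    for a in sorted(candidates):
        if not breaks_down(a):
            return a
    return None


# Toy stand-in for the expensive check: pretend breakdown occurs below 0.1.
CANDIDATES = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]
BEST = smallest_safe_shift(lambda a: a < 0.1, CANDIDATES)
```

Every candidate that is tried costs one factorization and one solve, which is why a method that tolerates a larger-than-optimal shift removes most of this burden.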

5 Conclusions 

In this paper we have proposed auto-accelerated ILU preconditioning (A^2ILU preconditioning), which improves performance without losing the practicality of ILU preconditioning. A^2ILU preconditioning incorporates new acceleration parameters into ILU preconditioning and optimizes them automatically. Previous ILU-based methods all have practicality issues because their parameters must be set by users. In contrast, A^2ILU(0) preconditioning is highly practical because, like ILU(0) preconditioning, it is a "parameter-free" method for users.

We verified the following merits of A^2ILU(0) preconditioning using systems of linear equations arising from physical simulations based on rectangular grids.

• For five sample problems whose coefficient matrices have dimensions in the hundreds of thousands, A^2ILU(0) preconditioning is on average 1.65 times as fast as ILU(0) preconditioning. It is also faster than other ILU-based methods whose original parameters are optimized manually. We conclude that A^2ILU(0) is superior to the previous ILU-based methods in both practicality and performance.

• Its scalability with respect to the size of the coefficient matrix is superior to that of ILU(0) preconditioning. The number of iterations is $O(h^{-0.65})$, which is lower than the $O(h^{-1})$ of ILU(0), where h denotes the mesh size. Once the matrix size exceeds 260 million, it is more than 3.5 times as fast as ILU(0) preconditioning.

Furthermore, the proposed auto-acceleration was applied to previous major 
ILU-based methods with the following confirmed merits: 

• For shifted ILU(0), modified ILU(0), and fill-in level ILU, the performance improves regardless of the value of the parameter unique to each method.

• For Crout ILU, the performance improves even when a denser preconditioned matrix is generated and the original Crout ILU therefore already performs well.

• For shifted ILU(0) and modified ILU(0), the performance is stable with respect to changes in the unique parameter, so the burden of setting the parameter is reduced and practicality is improved.

For general sparse matrices, we have shown the effectiveness of shifted A^2ILU(0). We evaluated the performance of shifted A^2ILU(0) preconditioning using over 200 general sparse matrices obtained from the University of Florida Sparse Matrix Collection. The results confirmed the following merits:

• There is no reduction in the number of convergent cases compared with shifted ILU(0) preconditioning over all the shift parameters examined in this paper. In addition, when the shift parameter is beyond 0.3, the auto-acceleration increases the number of convergent cases of shifted ILU(0).

• Compared with shifted ILU(0) preconditioning, many more cases improve in convergence rate than worsen.

• Even if the value of the shift parameter is raised to ensure convergence, the convergence rate does not drop significantly, unlike with shifted ILU(0) preconditioning. Hence, this method maintains both safety and performance, and is thus more practical.